SHARP: Harmonizing Cross-workflow Provenance

نویسندگان

  • Alban Gaignard
  • Khalid Belhajjame
  • Hala Skaf-Molli
چکیده

PROV has been adopted by a number of workflow systems for encoding the traces of workflow executions. Exploiting these provenance traces is hampered by two main impediments. Firstly, workflow systems extend PROV differently to cater for system-specific constructs. The difference between the adopted PROV extensions yields heterogeneity in the generated provenance traces. This heterogeneity diminishes the value of such traces, e.g. when combining and querying provenance traces of different workflow systems. Secondly, the provenance recorded by workflow systems tends to be large, and as such difficult to browse and understand by a human user. In this paper, we propose SHARP, a Linked Data approach for harmonizing cross-workflow provenance. The harmonization is performed by chasing tuple-generating and equalitygenerating dependencies defined for workflow provenance. This results in a provenance graph that can be summarized using domain-specific vocabularies. We experimentally evaluate the effectiveness of SHARP using a real-world omic experiment involving workflow traces generated by the Taverna and Galaxy systems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SHARP: Harmonizing Galaxy and Taverna Workflow Provenance

SHARP is a Linked Data approach for harmonizing cross-workflow provenance. In this demo, we demonstrate SHARP through a real-world omic experiment involving workflow traces generated by Taverna and Galaxy systems. SHARP starts by interlinking provenance traces generated by Galaxy and Taverna workflows and then harmonize the interlinked graphs thanks to OWL and PROV inference rules. The resultin...

متن کامل

The Aspect-Oriented Architecture of the CAPS Framework for Capturing, Analyzing and Archiving Provenance Data

With aspect-oriented programming techniques, modularity may be achieved via separating cross-cutting concerns. Data provenance can be considered as a crosscutting concern: code for collecting provenance data is usually scattered across various places in a software system. Aspect-oriented programming allows to seamlessly integrate cross-cutting concerns into existing software applications withou...

متن کامل

A Model to Capture Interaction between Data Provenance and Workflow Provenance

Provenance means origin. There are two types of Provenance namely Data Provenance and Workflow Provenance. Data Provenance refers to the process of recording and tracking the source or origin of data while workflow provenance means history of workflows and their data that were used while performing an operation to get required result. The results which are obtained while performing operations a...

متن کامل

Provenance Collection Support in the Kepler Scientific Workflow System

In many data-driven applications, analysis needs to be performed on scientific information obtained from several sources and generated by computations on distributed resources. Systematic analysis of this scientific information unleashes a growing need for automated data-driven applications that also can keep track of the provenance of the data and processes with little user interaction and ove...

متن کامل

Abstract Provenance Graphs: Anticipating and Exploiting Schema-Level Data Provenance

Provenance Graphs: Anticipating and Exploiting Schema-Level Data Provenance Daniel Zinn Bertram Ludäscher {dzinn,ludaesch}@ucdavis.edu Abstract. Provenance graphs capture flow and dependency information recorded during scientific workflow runs, which can be used subsequently to interpret, validate, and debug workflow results. In this paper, we propose a new concept, called abstract provenance g...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017